torchao: safetensors save/load + disk group offload (closes #13713) by itzzdeep · Pull Request #13721 · huggingface/diffusers

itzzdeep · 2026-05-11T21:21:31Z

What does this PR do?

Description

Adds safetensors save/load and disk-based group offloading for TorchAO-quantized models. Implemented via 4 no-op hooks on DiffusersQuantizer, overridden by TorchAoHfQuantizer:

get_state_dict_and_metadata — flatten subclasses on save
set_metadata — read torchao header from each shard on load
update_loaded_keys — collapse _weight_* suffixes to canonical names
update_state_dict_with_metadata — per-shard unflatten with cross-shard buffer
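A minimal sketch of how the four hooks could fit together, under stated assumptions. It uses plain dicts in place of tensors, and everything except the four hook names is hypothetical: the class names, the signatures, the `_weight_qdata` / `_weight_scale` key naming, and the `toy_int8` metadata value. It is not the code in this PR.

```python
COMPONENTS = ("qdata", "scale")  # hypothetical flattened components of one quantized weight


def _split(key):
    """Return (canonical_name, component), or (key, None) for an ordinary key."""
    for comp in COMPONENTS:
        suffix = f"_weight_{comp}"
        if key.endswith(suffix):
            return key[: -len(suffix)], comp
    return key, None


class BaseQuantizerSketch:
    """Stand-in for the base quantizer: every hook is a no-op."""

    def get_state_dict_and_metadata(self, state_dict):
        return state_dict, {}

    def set_metadata(self, shard_metadata):
        pass

    def update_loaded_keys(self, keys):
        return list(keys)

    def update_state_dict_with_metadata(self, shard):
        return shard


class ToyFlatteningQuantizer(BaseQuantizerSketch):
    """Toy override: a 'quantized tensor' is a dict holding qdata and scale."""

    def __init__(self):
        self.metadata = {}
        self._pending = {}  # cross-shard buffer: canonical name -> components seen so far

    def get_state_dict_and_metadata(self, state_dict):
        flat, metadata = {}, {}
        for name, value in state_dict.items():
            if isinstance(value, dict):  # flatten the toy subclass on save
                for comp in COMPONENTS:
                    flat[f"{name}_weight_{comp}"] = value[comp]
                metadata[name] = "toy_int8"
            else:
                flat[name] = value
        return flat, metadata

    def set_metadata(self, shard_metadata):
        self.metadata.update(shard_metadata)  # called once per shard header

    def update_loaded_keys(self, keys):
        canonical = []
        for key in keys:
            base, _ = _split(key)
            if base not in canonical:
                canonical.append(base)
        return canonical

    def update_state_dict_with_metadata(self, shard):
        out = {}
        for key, value in shard.items():
            base, comp = _split(key)
            if comp is None or base not in self.metadata:
                out[key] = value
                continue
            parts = self._pending.setdefault(base, {})
            parts[comp] = value
            if len(parts) == len(COMPONENTS):  # all components have arrived
                out[base] = self._pending.pop(base)
        return out


original = {"proj.weight": {"qdata": [1, -2, 3], "scale": 0.5}, "norm.bias": [0.0, 0.1]}
flat, metadata = ToyFlatteningQuantizer().get_state_dict_and_metadata(original)

# Split so the two components of proj.weight land in different shards.
shard_a = {k: v for k, v in flat.items() if not k.endswith("_scale")}
shard_b = {k: v for k, v in flat.items() if k.endswith("_scale")}

loader = ToyFlatteningQuantizer()
loader.set_metadata(metadata)
restored = {}
for shard in (shard_a, shard_b):
    restored.update(loader.update_state_dict_with_metadata(shard))

loaded_keys = loader.update_loaded_keys(flat.keys())
```

In this toy run, `proj.weight` is only emitted once the second shard supplies its scale, which is the role the cross-shard buffer plays in the description above.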

Group offload reuses the same path. No torchao-specific code in modeling_utils. Requires torchao ≥ 0.15.0 and version=2 configs; v1 raises a ValueError naming the fix.
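The version gate described above might look roughly like the following sketch. The function name, the error messages, and the choice of `ImportError` for an outdated torchao are assumptions; only the `0.15.0` minimum, the `version=2` requirement, and the `ValueError` for v1 configs come from the description.

```python
import re

MIN_TORCHAO = (0, 15, 0)


def parse_version(text):
    """Extract (major, minor, patch) from strings such as '0.15.0' or '0.15.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"Unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def check_safetensors_support(torchao_version, config_version):
    """Hypothetical guard run before safetensors save/load of a quantized model."""
    if parse_version(torchao_version) < MIN_TORCHAO:
        # Assumption: the exception type for an old torchao is not stated in the PR.
        raise ImportError(
            f"safetensors serialization needs torchao >= 0.15.0, found {torchao_version}"
        )
    if config_version != 2:
        # The description says v1 raises a ValueError that names the fix.
        raise ValueError(
            f"Quantization config version {config_version} is not supported; "
            "re-create the config with version=2."
        )


def error_for(torchao_version, config_version):
    """Return the exception type name raised by the guard, or None if it passes."""
    try:
        check_safetensors_support(torchao_version, config_version)
    except (ImportError, ValueError) as exc:
        return type(exc).__name__
    return None
```

Real code would more likely compare versions with `packaging.version` than with a regular expression; the regex keeps this sketch dependency-free.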

Results

End-to-end verified on Lumina-Image-2.0 (2.61B params, A100, Int8 weight-only).

GPU memory	Baseline	Disk group offload
Peak during forward	2.53 GB	0.16 GB
Resident between forwards	2.47 GB	0.009 GB

Save → reload is bit-perfect (cosine_dist = 0.0). Forward latency cost: 87 ms → 1592 ms.
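To put the trade-off in relative terms, the ratios implied by the numbers above can be computed directly. The variable names are illustrative; the values are copied from the table and the latency figures.

```python
baseline = {"peak_gb": 2.53, "resident_gb": 2.47, "latency_ms": 87}
offload = {"peak_gb": 0.16, "resident_gb": 0.009, "latency_ms": 1592}

peak_reduction = baseline["peak_gb"] / offload["peak_gb"]
resident_reduction = baseline["resident_gb"] / offload["resident_gb"]
latency_slowdown = offload["latency_ms"] / baseline["latency_ms"]

print(f"Peak memory reduction:     {peak_reduction:.1f}x")      # 15.8x
print(f"Resident memory reduction: {resident_reduction:.1f}x")  # 274.4x
print(f"Forward latency slowdown:  {latency_slowdown:.1f}x")    # 18.3x
```

On these figures, disk group offload cuts peak GPU memory by about 16x at the cost of a forward pass that is about 18x slower.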

Tests

TorchAoSerializationTest passes 7/7 and test_group_offloading.py passes 38/38. Save/reload of the real Flux.1-Dev model works; disk group offload on Flux is blocked by pre-existing torchao bugs.

[x] Did you write any new necessary tests?
Added test_safetensors_save_int_a16w8 and test_group_offload_to_disk_int_a16w8 in tests/quantization/torchao/test_torchao.py

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul

sayakpaul · 2026-05-12T00:45:42Z

Thanks for your contributions. However, #13719 was opened before this one, so we will go with #13719. I am sure there will be future contribution opportunities.

wadeKeith

Solid improvement - safetensors save/load plus disk group offload for torchao. Properly closes #13713. Good test coverage in the quantizer module. LGTM! Reviewed by Hermes Agent.

Commit 41f1eb3: torchao: safetensors save/load + disk group offload (closes huggingface#13713)

github-actions Bot added size/L PR with diff > 200 LOC fixes-issue quantization models tests hooks labels May 11, 2026

wadeKeith reviewed May 12, 2026

itzzdeep wants to merge 1 commit into huggingface:main from itzzdeep:torchao-safetensors
